by Mark S. Young

Today's disk subsystems provide large amounts of 
memory and fast access with minimal cost, at levels 
designers could only dream about a few years ago. 
Continual increases in track and linear bit density— 
and, as a result, in transfer speeds— combine with 
decreases in access times to improve performance of 
disk products. Disk controllers and operating systems
have also enjoyed steady gains that increase speed.
Improvements in processor speeds, however, have led 
system integrators to demand still higher throughputs. 

Yet, several design techniques can still yield better
overall data throughput. Typical disk drives require
between 30 and 90 ms for a seek operation. Since
the read/write head positioning, or seek, consumes
the most time in a disk I/O operation, controller and
operating system designers have focused their efforts
there.

In a multidrive system, controller design can 
change the sequential series of disk access operations 
into parallel ones. A typical 5 1/4-in. Winchester, using
the ST506 interface, can accept head movement
pulses about once every 10 ms (1 pulse means to move
the read/write heads from one track to the next). A
read/write head requires up to 3 ms to move from 
track to track, depending on how far it has to move. 
Most small Winchester disks do not require selection 
while read/write head positioning is taking place. 
After the controller issues a complete head movement 
command, therefore, it can turn its attention to 
another drive while the positioning operation occurs. 

Optimizing seek operations

Stacked command and linked-list structures support
elevator sorting to optimize data access through
seek-overlapped operations (a). Stacked command requires
operating system intervention and extra CPU
overhead and memory. Linked list is a more flexible
structure that simplifies the implementation by
providing link pointers (b).



[Figure: IOPB No. 1 (drive 0), IOPB No. 2 (drive 0), IOPB No. 3 (drive 2), IOPB No. 5 (drive 1), IOPB No. 4 (drive 2), and IOPB No. 6 (drive 3) queued for the disk controller.]



then processes another drive's seek operations while 
waiting for the current drive's command to be com- 
pleted. Average access time to all the disks is substan- 
tially improved. In a four-drive system, this could 
be a gain of as much as 400 percent. 
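
The size of this gain can be estimated with a toy timing model. The sketch below is an illustration only: the function names, the fixed 30-ms seek, and the fixed 2-ms transfer are assumptions made for the example, not figures from any controller. It compares handling requests strictly one after another with letting each drive seek on its own while only the data transfers share the controller.

```python
from collections import defaultdict

def serial_time(requests, seek_ms, xfer_ms):
    """Every request waits for the previous one to finish completely."""
    return len(requests) * (seek_ms + xfer_ms)

def overlapped_time(requests, seek_ms, xfer_ms):
    """Seeks on different drives run in parallel; transfers share the controller.

    requests is a list of drive numbers, all queued at time zero and
    served in list order (an assumption of this simplified model).
    """
    drive_free = defaultdict(float)   # time at which each drive is idle again
    controller_free = 0.0             # time at which the data path is idle again
    for drive in requests:
        seek_done = drive_free[drive] + seek_ms        # seek starts once the drive is idle
        start_xfer = max(seek_done, controller_free)   # transfer needs the controller
        controller_free = start_xfer + xfer_ms
        drive_free[drive] = controller_free
    return controller_free

if __name__ == "__main__":
    spread = [0, 1, 2, 3] * 25   # requests rotate over four drives
    same = [0] * 100             # every request hits drive 0
    print(serial_time(spread, 30, 2), overlapped_time(spread, 30, 2))
    print(serial_time(same, 30, 2), overlapped_time(same, 30, 2))
```

With requests spread evenly over four drives, the overlapped schedule in this model finishes in roughly a quarter of the serial time. With every request aimed at one drive, the two schedules take the same time, which is why the distribution of requests matters.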

The success of seek-overlap operations, as these
parallel seek operations are called, depends on random
disk file requests. If the computer's operating
system tends to access files on a single disk consis- 
tently (in a multidrive system), then seek-overlap has 
little advantage. Maximum benefit is gained when 
files are stored on a randomly chosen disk. In some 
Unix operating systems, for example, the disk file 
manager randomly selects a disk on which to store 
a newly created file. This ensures the disk controller 
will receive I/O access requests that are distributed 
among the various drives. 

Since files are not usually accessed in a truly random
manner, the randomizing technique may require
adjustments. By keeping track of all file
accesses over [...]

[...] that move steadily toward the disk spindle (toward
the inner tracks) and those that move steadily away
from the spindle (toward the outer tracks). In this
grouping, each successive access is in the same direction.
The read/write heads move first in one direction
and then the other, accessing requested data as
they go. This eliminates back and forth motions of
the head. The sorting function generally is handled
by either the disk file controller or the computer's
operating system.

A typical series of disk seek requests, if executed
as received, might require a total seek distance of
some 556 tracks. Sorting the incoming requests
numerically could reduce this to about 283 tracks, a
50 percent reduction in seek distance and time.

File dependencies, however, mean full performance
benefits cannot always be gained from optimal
elevator sorting. If one request requires a write
operation on a track and a later operation requires
a read, the elevator sort must maintain the correct [...]

commands long before they will be needed. The
stacked command structure is a commonly used tech- 
nique that supports these functions. An area in sys- 
tem memory is used to store disk I/O commands, 
usually one command per drive. The controller can 
easily initiate a primitive seek-overlap algorithm by 
allowing up to four disks to have commands pend- 
ing. Elevator sorting must be performed in the file 
manager or a separate disk I/O processor. To guaran- 
tee continuous use of the controller, the file manager 
must also update the command structure whenever 
a disk is free. A disadvantage of the stacked com- 
mand structure is that operating system intervention 
is required each time a disk I/O command is com- 
pleted. Thus the controller would introduce extra 
CPU overhead. 

The linked list command chain offers a more flex- 
ible command structure. A disk command chain, cre- 
ated using disk command blocks (I/O parameter 
blocks), is placed either in consecutive memory loca- 
tions or is strung together as a linked-list data struc- 
ture. Although this structure imposes some extra 
overhead, such as pointers to commands in the list, 
it maximizes system flexibility when integrating the
disk controller with the operating system. 

Zero-interleaved data transfers 
improve data I/O operations as
well as speed disk access times. 

The operating system creates the command struc- 
ture and the disk controller executes it. The linked- 
list structure allows IOPBs to be placed in the 
command chain well before execution without impos- 
ing size limitations. Since a queue of disk I/O com- 
mands is available to the disk controller, it simplifies 
the implementation of a seek-overlap function in the 
disk controller. 

Moreover, it allows the disk controller to start 
searching the command list for the command that 
starts the next seek as soon as one I/O operation 
finishes. This keeps the drive constantly busy without
system intervention. The linked-list structure also
speeds execution of the elevator access algorithm. 
Because the operating system can create the com- 
mands and then sort the seeks without moving com- 
mands around in system memory, only the link 
pointers have to be altered. 
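
A minimal sketch of this relinking follows. The `IOPB` class, its fields, and the track numbers are hypothetical, chosen only to illustrate the idea; a real command block carries far more information. The sketch also assumes the requests are independent, so the read-after-write ordering constraint is ignored. The sweep runs toward higher track numbers first and then back toward lower ones.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IOPB:
    """Hypothetical I/O parameter block holding only what this example needs."""
    number: int
    track: int
    next: Optional["IOPB"] = None

def link(blocks):
    """Chain the blocks in the given order and return the head of the list."""
    for a, b in zip(blocks, blocks[1:]):
        a.next = b
    if blocks:
        blocks[-1].next = None
    return blocks[0] if blocks else None

def walk(head):
    """Return the blocks in chain order."""
    out = []
    while head is not None:
        out.append(head)
        head = head.next
    return out

def elevator_relink(head, current_track):
    """Reorder the chain as one upward sweep followed by one downward sweep."""
    blocks = walk(head)
    upward = sorted((b for b in blocks if b.track >= current_track),
                    key=lambda b: b.track)
    downward = sorted((b for b in blocks if b.track < current_track),
                      key=lambda b: b.track, reverse=True)
    return link(upward + downward)   # the blocks stay put; only next pointers change

def seek_distance(head, current_track):
    """Total number of tracks crossed when the chain is executed in order."""
    total = 0
    for b in walk(head):
        total += abs(b.track - current_track)
        current_track = b.track
    return total

if __name__ == "__main__":
    tracks = [98, 183, 37, 122, 14, 124, 65, 67]   # made-up request stream
    head = link([IOPB(i + 1, t) for i, t in enumerate(tracks)])
    print("as received:", seek_distance(head, 53))
    head = elevator_relink(head, 53)
    print("elevator order:", seek_distance(head, 53))
```

For this made-up request stream, starting at track 53, the total seek distance drops from 640 tracks as received to 299 tracks after relinking, a reduction of about the same proportion as the one quoted earlier.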

Disk I/O operations can also be improved by continuously
transferring data to and from a track. This
method [...]
amounts of physical space between logically adjacent
track sectors. It also provides time between processing
of the data to (or from) the disk and the arrival
of the next desired sector on the track.

Data buffering and high speed control capability
can eliminate the need for interleaving. One method
uses first in, first out buffers to allow zero-interleaved
operations. FIFOs, however, are limited in
size and allow underflow/overflow. A number of
simple, low performance system FIFOs in various
sizes (8, 32, or 128 bytes) are used in VLSI controllers
to minimize hardware costs, while still allowing
zero-interleaved transfers.

Alternate data transfer methods 

Another method uses a pair of sector or toggle
buffers. This allows disk data to fill or empty one
buffer while the system empties or fills the other.
Continuous data transfer is provided by switching
buffers at the correct time. System memory can be
completely decoupled from the disk drives, since all
disk data from one transfer is contained in one of
two buffers. Toggle buffers also prevent data
underflows/overflows and allow on-controller correction
of data errors.
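
The toggling itself is simple to model. The sketch below is a software illustration under assumed names (`ToggleBuffers`, `disk_write`, `system_read`); in a real controller the two sides run concurrently in hardware. It shows the disk side refusing a sector when both buffers are full, rather than overwriting data the system has not yet taken.

```python
class ToggleBuffers:
    """Two sector buffers: the disk side fills one while the system drains the other."""

    def __init__(self):
        self.buffers = [None, None]   # None marks an empty buffer
        self.fill_index = 0           # buffer the disk side writes next
        self.drain_index = 0          # buffer the system side reads next

    def disk_write(self, sector):
        """Store a sector from the disk; return False if no buffer is free."""
        if self.buffers[self.fill_index] is not None:
            return False              # both buffers full: the disk side must wait
        self.buffers[self.fill_index] = sector
        self.fill_index ^= 1          # toggle to the other buffer
        return True

    def system_read(self):
        """Hand the oldest full sector to the system, or None if nothing is ready."""
        sector = self.buffers[self.drain_index]
        if sector is None:
            return None
        self.buffers[self.drain_index] = None
        self.drain_index ^= 1         # toggle to the other buffer
        return sector

if __name__ == "__main__":
    tb = ToggleBuffers()
    print(tb.disk_write(b"A"), tb.disk_write(b"B"), tb.disk_write(b"C"))
    print(tb.system_read(), tb.disk_write(b"C"))
```

Because a whole sector sits in a buffer before the system sees it, the controller has the chance to check and correct the sector first, which is the on-controller error correction mentioned above.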

A dual-ported RAM can also implement the buffering
needed to perform zero-interleaved operations.
These RAMs are expensive, and require close coordination
of timing control and addressing. A track
buffer can hold all data from a single disk track.
This RAM, however, should be designed as a variation
of the dual-ported RAM to ensure continuous
throughput between the disk and the system.

Whatever method is used, the zero-interleaved
operation significantly improves disk access times.
A typical Winchester disk rotates at 3600 rpm,
requiring about 16.7 ms per revolution. With a
controller interleaving of degree five on each track
(logically adjacent sectors are physically separated
by five sectors), at least six revolutions (or 100 ms)
are required to read or write data to an entire track.
This actual data transfer time, when added to the
time required to move the read/write heads to a track
(80 to 90 ms), results in a total I/O access time that
is nearly twice that using zero interleaving.
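
The arithmetic behind this comparison can be checked in a few lines. The function names below are assumptions made for the example; the figures (3600 rpm, six revolutions with interleaving, one revolution without, and an 80- to 90-ms head movement) are the ones quoted above.

```python
def revolution_ms(rpm):
    """Time for one disk revolution, in milliseconds."""
    return 60_000.0 / rpm

def track_transfer_ms(rpm, revolutions):
    """Time to read or write a full track that needs the given number of revolutions."""
    return revolutions * revolution_ms(rpm)

def total_access_ms(seek_ms, rpm, revolutions):
    """Head movement plus full-track transfer."""
    return seek_ms + track_transfer_ms(rpm, revolutions)

if __name__ == "__main__":
    for seek_ms in (80, 90):
        interleaved = total_access_ms(seek_ms, 3600, 6)   # interleave of five
        zero = total_access_ms(seek_ms, 3600, 1)          # zero interleave
        print(seek_ms, round(interleaved, 1), round(zero, 1),
              round(interleaved / zero, 2))
```

With these figures the interleaved access takes about 180 to 190 ms against roughly 97 to 107 ms for zero interleaving, a ratio between about 1.8 and 1.9.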

Many controllers require sectors on a track to be
interleaved. Interleaving calls for specific [...]

A variation of zero-interleaved transfers, a technique
called "nonsequential" sector access, further
boosts overall performance. Data is stored on the
disk tracks in blocks called sectors. These sectors are
logically accessed in a sequential manner (sector
numbered A, then A + 1, A + 2, etc). A normal
controller logically accesses several consecutive [...]
subsequent sequential sectors.
Statistically, however, the controller will miss the
desired first sector half of the time. This means that 
about half the amount of time a rotation takes must 
be added to the "access" time of every new track 
operation. This delay, about 8 ms on 5 1/4-in. Win- 
chesters, is called "rotational latency" time. On fast 
disk drives (with average access time of 30 ms), rota- 
tional latency can increase the average access time 
of the disk by about 15 percent (ie, 30 ms for seeking, 
16 ms for read/write, and 8 ms for rotational latency). 

The nonsequential sector access method minimizes 
rotational latency by accessing the requested sectors 
in whatever order they are found. As soon as any 
desired sector (A, A+ 1, A + 2, etc) is located, the 
disk controller performs the necessary I/O operation 
on that sector. Thus, the maximum time necessary 
to access any amount of data is no more than that 
required for one rotation of the disk plus one sector. 
The nonsequential access method is implemented by 
using a track buffer to hold the contents of the track 
during I/O operations. The controller, however, re- 
quires extra hardware for the logic necessary to 
recognize when a desired sector passes under the 
read/write heads. 
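
The difference between the two access orders can be seen in a small model that counts time in sector times. This is a sketch under stated assumptions: the function names are invented for the example, the head is taken to sit exactly at a sector boundary, zero interleaving is assumed, and the 17-sector track is only an illustrative size.

```python
def sequential_time(n_sectors, head, first, count):
    """Sector times needed when the controller must begin at sector `first`.

    head is the sector about to pass under the read/write head.
    """
    wait = (first - head) % n_sectors   # rotational latency before sector `first` arrives
    return wait + count

def nonsequential_time(n_sectors, head, first, count):
    """Sector times needed when wanted sectors are taken in arrival order."""
    wanted = {(first + i) % n_sectors for i in range(count)}
    last = max((s - head) % n_sectors for s in wanted)   # last wanted sector to arrive
    return last + 1

if __name__ == "__main__":
    n = 17
    # Full-track read of sectors 0 to 16 with the head about to reach sector 5.
    print(sequential_time(n, 5, 0, n), nonsequential_time(n, 5, 0, n))
```

In this model the nonsequential order never takes longer than the sequential one and never exceeds one rotation. The extra sector in the bound quoted above accounts for the head arriving partway through a sector, which the model leaves out.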



Boosting disk performance 

A disk data cache provides an increasingly com- 
mon method of boosting disk performance. Disk 
cache functions are much like those of caches used 
for CPUs. As data is read or written from or to the 
disks, copies are stored in the cache. If the operating 
system later requests the same data, the data can be 
taken from cache instead of from disk. Disk caching 
boosts disk I/O performance from two to nine times 
because head positioning delays are not involved. 
Since cache is not disk dependent, slow, low cost 
disks can be used with a cache controller to give per- 
formance approaching that of expensive, high perfor- 
mance drives. 
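
A disk cache of this kind can be sketched in a few lines. The article does not say which replacement or write policy a given controller uses, so the least-recently-used eviction and write-through behavior below are assumptions, as are the class and parameter names.

```python
from collections import OrderedDict

class SectorCache:
    """Least-recently-used cache of disk sectors, keyed by (drive, track, sector)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk   # callable supplied by the caller
        self.sectors = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.sectors:
            self.sectors.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self.sectors[key]
        self.misses += 1
        data = self.read_from_disk(key)        # slow path: head positioning plus rotation
        self._store(key, data)
        return data

    def write(self, key, data, write_to_disk):
        write_to_disk(key, data)               # write-through keeps the disk current
        self._store(key, data)

    def _store(self, key, data):
        self.sectors[key] = data
        self.sectors.move_to_end(key)
        if len(self.sectors) > self.capacity:
            self.sectors.popitem(last=False)   # evict the least recently used sector

if __name__ == "__main__":
    cache = SectorCache(2, lambda key: f"data{key}")
    for key in [(0, 1, 1), (0, 1, 2), (0, 1, 1), (0, 9, 4), (0, 1, 2)]:
        cache.read(key)
    print(cache.hits, cache.misses)
```

Every hit avoids a head movement entirely, which is where the speedup comes from. How large it is depends on how often the operating system asks again for sectors that are still in the cache.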

Disk caching can be done either by using the oper- 
ating system, or by building a dedicated cache con- 
troller. Having the operating system handle cache 
requires cache management software and reduces 
system memory space (since it must be allocated to 
the cache buffers). The other popular method of 
implementing disk caching is through dedicated 
cache controller hardware. This involves adding 
RAM, a cache manager (usually a microprocessor 
and ROM), software, some form of interface to the 
operating system [...]
Certain file organizations can improve disk access
times. Contiguous allocation allows file data to be
stored compactly and retrieved quickly on the disk,
but suffers from space allocation and garbage
collection problems (a). Linked-list allocation
overcomes contiguous allocation's disadvantages, but
introduces extra overhead, speed problems, and
vulnerability to file damage. Moreover, it does not
allow random file data access (b). The indexed
allocation scheme, widely used in Unix systems,
overcomes most of the problems in a and b, but has
some speed trouble, extra complexity, and can
impose file size limits (c).



read/write head to another usually requires only 
microseconds, while changing tracks requires sev- 
eral milliseconds. 

A recurring problem with many disk-based operat- 
ing systems is file fragmentation, which occurs when 
files are updated (ie, increased or decreased in size). 
One reason for the popularity of the Berkeley ver- 
sion of Unix (4.2 BSD) is its use of an improved file 
management system to increase file access speed and 
reduce fragmentation. 

Files are stored on disks using linear (contiguous),
linked-list, or indexed allocation.
The linear method stores files in single, contiguous 
blocks on the disk, minimizing the time needed to 
read the files. To increase or decrease file size, data 
must be moved to another area on the disk that is 
equal to or greater than the new file. Because files 
are allocated randomly, free space is broken into a 
large number of pieces. Frequent compaction is re- 
quired to maximize the disk's storage capacity, 
usually at a great cost in disk processing time. 

The linked-list file structure incorporates a pointer 
in each sector to indicate the next sector belonging 
to the file. Fragmentation is avoided because every 
sector on the disk can be used in disk files. Despite 
the advantages of using complete disk space and the 
flexibility in allocating space, the linked structure re- 
quires extra overhead in each sector for the pointers. 
The structure is also more vulnerable to damage; if 
the link in one sector is damaged, the rest of the file 
is lost. The operating system's inability to make 
direct accesses into the file is also a shortcoming. 

The indexed allocation method uses a pointer table 
built into the beginning of the sector to define all sec- 
tors in the file. This method, like the linked list, 
minimizes fragmentation on the disk because all sec- 
tors can be used and direct access is allowed. The 
space required for the file pointer table, however, is 
usually fixed at file creation and must serve for the 
life of the file. Since estimating the table size is usually 
difficult, overestimation is the rule, costing storage 
space. The space required for the table frequently ex- 
ceeds that used for the linked-list method. And unless 
the index tables are kept in the controller, the ele- 
vator sort algorithm cannot be properly exploited. 
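
The direct-access difference between the linked and indexed schemes shows up clearly when reads are counted. In the sketch below the disk is modeled as a dictionary from sector number to a (data, next sector) pair. That layout, the function names, and the sector numbers are assumptions made for the example. The index table is assumed to be in memory already; fetching it from disk would cost at least one more read.

```python
def linked_read(disk, first_sector, n):
    """Follow next-sector pointers to logical block n; returns (data, sectors_read)."""
    sector, reads = first_sector, 0
    for _ in range(n):
        _, sector = disk[sector]      # each hop costs one disk read
        reads += 1
        if sector is None:
            raise IndexError("block past end of file")
    data, _ = disk[sector]
    return data, reads + 1

def indexed_read(disk, index_table, n):
    """Look up logical block n in the pointer table; one data read."""
    data, _ = disk[index_table[n]]
    return data, 1

if __name__ == "__main__":
    # A four-block file scattered over sectors 7 -> 2 -> 9 -> 4.
    disk = {7: ("A", 2), 2: ("B", 9), 9: ("C", 4), 4: ("D", None)}
    print(linked_read(disk, 7, 3))
    print(indexed_read(disk, [7, 2, 9, 4], 3))
```

Reaching the fourth block takes four reads through the chain but only one through the table. The chain also shows the vulnerability noted above: if the pointer stored in sector 2 is damaged, sectors 9 and 4 can no longer be found.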

Programmable features in modern disk controllers 
combine the best features of all three file storage 
methods while minimizing their disadvantages. Keeping
the different file techniques distinct from one [...]
But it still allows sectors to be added without resulting
in fragmentation or excessive file movement.

Ondisk file caching also boosts overall disk/system
throughput. This method stores frequently used files
in the most rapidly accessible part of the disk.
Normally, files are stored on the disk randomly (based
on such things as space requirements and partitioning
of the disk data space).



Second-generation VLSI disk controllers 



A primary requirement of second generation VLSI 
disk controllers is maximum disk and system per- 
formance. The Am9580 hard disk controller's dual- 
buffer architecture with integrated DMA controller 
boosts disk system performance in two ways. First, 
it fully decouples the disk serial interface (up to 16 
Mbits/s) from the system bus. This allows the system 
to operate without being tied into the peculiarities 
of the disk timings. Second, it provides efficient 
data transfers and cuts software overhead. The 
DMA controller can transfer data at rates up to 5 
Mbytes/s and supports 8- or 16-bit interfaces. Pro- 
grammable bus throttling regulates bus activity of 
the hard disk controller on the system bus. 

Another facet of second generation disk con- 
trollers is the large amount of integrated software. 
The Am9580 has two different microengines operat- 
ing in parallel and executing specialized disk data 
control algorithms. One of the microengines, the 
data format controller, handles all the serial transfer 
of data to and from the disk. 

In addition to performing track formatting and 
sector reads and writes, it handles special data 
recovery algorithms, and floppy or Winchester disk 
formats. The second microcontroller, the command 
sequencer, is responsible for interpreting the 16 
different disk commands, coordinating DMA activity,
and handling the disk control interface. Command,
status, and data transfers are handled by the
controller with minimal CPU intervention. The hard
disk controller uses a linked-list command struc- 
ture. Command blocks (called I/O parameter blocks) 
are set up as a linked list in system memory. The 
disk controller automatically fetches and executes 
the commands without additional CPU overhead. 
Data transfers between the disk and system are 
handled by the onboard DMA controller. An addi- 
tional control structure, called data map, allows the 
Am9580 to break up or combine data scattered 
throughout system memory and collect it into con- 
tiguous blocks. 
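
The idea of a data map can be illustrated in software. The layout of the Am9580's data map is not described here, so the list of (address, length) pairs below is purely a hypothetical stand-in, as are the function names. The sketch only shows how one contiguous disk block relates to pieces scattered through memory.

```python
def gather(memory, data_map):
    """Collect scattered memory regions into one contiguous block for the disk."""
    return b"".join(bytes(memory[addr:addr + length]) for addr, length in data_map)

def scatter(memory, data_map, block):
    """Spread one contiguous disk block back over the mapped memory regions."""
    if sum(length for _, length in data_map) != len(block):
        raise ValueError("data map does not cover the whole block")
    offset = 0
    for addr, length in data_map:
        memory[addr:addr + length] = block[offset:offset + length]
        offset += length

if __name__ == "__main__":
    memory = bytearray(b"....HEAD....BODY")
    data_map = [(12, 4), (4, 4)]        # hypothetical (address, length) entries
    print(gather(memory, data_map))
    scatter(memory, data_map, b"bodyhead")
    print(bytes(memory))
```

Doing this in the controller spares the CPU from copying data into one staging buffer before each transfer.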

Status result information is stored in a specially 
designated area of system memory called the sta- 
tus result area. Errors resulting from abnormal disk 
behavior are reported to the system in the status 
result area along with an identification code that 
describes which command caused the error. Users 
can program how the controller will handle errors 
and when it will abort a command. 

Finally, the Am9580 supports several different 
disk drive control interfaces. The industry standard 
ST506 Winchester interface is fully implemented. 
For floppy drives, a floppy-like control interface is 
supported. To accommodate custom disk inter- 
faces, a special programmable option allows differ- 
ent portions of the disk control interface to be 
selectively disabled. This allows the user to imple- 
ment (externally) any desired control interface. 
Optimal disk access is achieved, however, by centering
the read/write heads around the middle tracks on
the platter because the average distance to any location
(assuming random disk I/O requests) is at a minimum.
To ensure this happens, the disk file manager
keeps track of how frequently different files on
the disk are accessed. Then the file manager must
balance the files so those most frequently accessed
are located on the center tracks. In paged virtual
memory systems, spare data blocks allocated on the
center tracks should be used to accommodate data
that is constantly being swapped in and out of system
memory.

Ondisk caching can be put into the operating system's
normal disk I/O routines without degrading
their throughput. During medium and high system
load periods, the operating system should only
statistically track disk I/O. Periods of low use (about
[...] percent or less) can then be used by the disk sorting
routines to move files around (based on current
statistics) and to clean up fragmented files and put
them into linear blocks. These routines would move
least recently used files to outer or inner tracks, and
migrate more frequently used files to center tracks.
Special options would allow the operating system
designer to force certain files to middle tracks or prevent
the ondisk caching routines from moving them.
Since many systems have long periods of low load- 
ing, daily updates of the ondisk cache would not de- 
grade users' response time. Thus, overall throughput 
of the disk drives would increase without any system 
overhead that is visible to the user. 
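
The placement step of such a sorting routine might look like the sketch below. It is a deliberately simplified model with assumed names: each file occupies exactly one track, the access counts are made up, and pinned files and the actual data movement are left out.

```python
def center_out_tracks(n_tracks):
    """Track numbers ordered from the middle of the platter outward, alternating sides."""
    middle = n_tracks // 2
    order = [middle]
    for step in range(1, n_tracks):
        for track in (middle - step, middle + step):
            if 0 <= track < n_tracks and len(order) < n_tracks:
                order.append(track)
    return order

def place_files(access_counts, n_tracks):
    """Map each file to a track, busiest file nearest the center (one file per track)."""
    ranked = sorted(access_counts, key=lambda name: (-access_counts[name], name))
    if len(ranked) > n_tracks:
        raise ValueError("more files than tracks in this simplified model")
    return dict(zip(ranked, center_out_tracks(n_tracks)))

if __name__ == "__main__":
    counts = {"log": 50, "db": 900, "tmp": 5, "index": 300}   # accesses since last sort
    print(place_files(counts, 7))
```

Run during a period of low load, a routine of this kind needs only the access statistics gathered while the system was busy.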

Disk controllers employing these techniques are 
not simple systems based on one or two VLSI chips. 
Instead, the disk controller is built up around power- 
ful disk controller ICs, such as the Am9580 con- 
troller and the Am9581 disk data separator, and 
supported by a microprocessor, random logic, RAM 
(large amounts for caching), and ROM to contain 
all the necessary software algorithms. Although 
single-user, single-tasking microcomputers do use 
these techniques, newer machines and Unix-based 
systems require very high performance disk systems 
to perform adequately in most multi-user, multi- 
tasking environments. 